Results 1 - 6 of 6
1.
Nat Genet ; 55(7): 1243-1249, 2023 07.
Article in English | MEDLINE | ID: mdl-37386248

ABSTRACT

Phasing involves distinguishing the two parentally inherited copies of each chromosome into haplotypes. Here, we introduce SHAPEIT5, a new phasing method that quickly and accurately processes large sequencing datasets, and apply it to UK Biobank (UKB) whole-genome and whole-exome sequencing data. We demonstrate that SHAPEIT5 phases rare variants with switch error rates below 5% for variants present in just 1 sample out of 100,000. Furthermore, we outline a method for phasing singletons, which, although less precise, constitutes an important step towards future developments. We then demonstrate that using UKB as a reference panel improves the accuracy of genotype imputation, an improvement that is even more pronounced when the panel is phased with SHAPEIT5 than with other methods. Finally, we screen the UKB data for loss-of-function compound heterozygous events and identify 549 genes where both gene copies are knocked out. These genes complement current knowledge of gene essentiality in the human genome.


Subject(s)
Biological Specimen Banks , Genome, Human , Humans , Exome Sequencing , Sequence Analysis, DNA/methods , Genotype , Haplotypes , Genome, Human/genetics , United Kingdom , Polymorphism, Single Nucleotide/genetics
2.
Nat Genet ; 55(7): 1088-1090, 2023 07.
Article in English | MEDLINE | ID: mdl-37386250

ABSTRACT

The release of 150,119 UK Biobank sequences represents an unprecedented opportunity to use them as a reference panel for imputing low-coverage whole-genome sequencing data with high accuracy, but current methods cannot cope with the size of the data. Here we introduce GLIMPSE2, a low-coverage whole-genome sequencing imputation method that scales sublinearly in both the number of samples and markers, achieving efficient whole-genome imputation from the UK Biobank reference panel while retaining high accuracy for ancient and modern genomes, particularly at rare variants and for very low-coverage samples.


Subject(s)
Biological Specimen Banks , Polymorphism, Single Nucleotide , Gene Frequency , Polymorphism, Single Nucleotide/genetics , Genome , United Kingdom , Genotype
3.
Nat Commun ; 13(1): 6668, 2022 11 05.
Article in English | MEDLINE | ID: mdl-36335127

ABSTRACT

Identical genetic variations can have different phenotypic effects depending on their parent of origin. Yet, studies focusing on parent-of-origin effects have been limited in sample size due to the lack of parental genomes or known genealogies. We propose a probabilistic approach to infer the parent-of-origin of individual alleles that requires neither parental genomes nor prior knowledge of genealogy. Our model uses Identity-By-Descent sharing with second- and third-degree relatives to assign alleles to parental groups and leverages chromosome X data in males to distinguish maternal from paternal groups. We combine this with robust haplotype inference and haploid imputation to infer the parent-of-origin for 26,393 UK Biobank individuals. We screen 99 phenotypes for parent-of-origin effects and replicate the discoveries of 6 GWAS, confirming signals on body mass index, type 2 diabetes, standing height and multiple blood biomarkers, including the known maternal effect at the MEG3/DLK1 locus on platelet phenotypes. We also report a novel maternal effect at the TERT gene on telomere length, thereby providing new insights into the heritability of this phenotype. All our summary statistics are publicly available to help the community better characterize the molecular mechanisms leading to parent-of-origin effects and their implications for human health.


Subject(s)
Diabetes Mellitus, Type 2 , Humans , Male , Alleles , Biological Specimen Banks , Genome-Wide Association Study , Phenotype , Female
4.
Nat Commun ; 12(1): 4842, 2021 08 10.
Article in English | MEDLINE | ID: mdl-34376650

ABSTRACT

Nearby genes are often expressed as a group. Yet, the prevalence, molecular mechanisms and genetic control of local gene co-expression are far from being understood. Here, by leveraging gene expression measurements across 49 human tissues and hundreds of individuals, we find that local gene co-expression occurs in 13% to 53% of genes per tissue. By integrating various molecular assays (e.g. ChIP-seq and Hi-C), we estimate the ability of several mechanisms, such as enhancer-gene interactions, to distinguish gene pairs that are co-expressed from those that are not. Notably, we identify 32,636 expression quantitative trait loci (eQTLs) which associate with co-expressed gene pairs and often overlap enhancer regions. Because they affect several genes, these eQTLs are more often associated with multiple human traits than other eQTLs. Our study paves the way to understanding trait pleiotropy and to the functional interpretation of QTL and GWAS findings. All local gene co-expression identified here is available through a public database ( https://glcoex.unil.ch/ ).


Subject(s)
Gene Expression Regulation , Genetic Pleiotropy/genetics , Genome, Human/genetics , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics , Binding Sites/genetics , Gene Ontology , Genetic Association Studies/methods , Humans , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/metabolism
6.
Nat Genet ; 53(1): 120-126, 2021 01.
Article in English | MEDLINE | ID: mdl-33414550

ABSTRACT

Low-coverage whole-genome sequencing followed by imputation has been proposed as a cost-effective genotyping approach for disease and population genetics studies. However, its competitiveness against SNP arrays is undermined because current imputation methods are computationally expensive and unable to leverage large reference panels. Here, we describe a method, GLIMPSE, for phasing and imputation of low-coverage sequencing datasets from modern reference panels. We demonstrate its remarkable performance across different coverages and human populations. GLIMPSE achieves imputation of a genome for less than US$1 in computational cost, considerably outperforming other methods and improving imputation accuracy over the full allele frequency range. As a proof of concept, we show that 1× coverage enables effective gene expression association studies and outperforms dense SNP arrays in rare variant burden tests. Overall, this study illustrates the promising potential of low-coverage imputation and suggests a paradigm shift in the design of future genomic studies.


Subject(s)
Sequence Analysis, DNA , Genome, Human , Genotype , Humans , Likelihood Functions , Polymorphism, Single Nucleotide/genetics , Reference Standards